Modern Time Series Forecasting with Python, Second Edition
Authors: Manu Joseph, Jeffrey Tackes
Publisher: Packt
Published: 2024-10-25T12:34:17+00:00
Now, let's look at what happens inside an RNN.
Let the input to the RNN at time t be x_t and the hidden state from the previous timestep be H_{t-1}. The update equations are as follows:

$$H_t = \tanh(U x_t + W H_{t-1} + b_1)$$

$$o_t = V H_t + b_2$$
Here, U, V, and W are learnable weight matrices, and b_1 and b_2 are two learnable bias vectors. U, V, and W can be easily remembered as the input-to-hidden, hidden-to-output, and hidden-to-hidden matrices, respectively, based on the kind of transformation they perform. Intuitively, we can think of the operation the RNN performs as learning and forgetting information as it sees fit. The tanh activation, as we saw in Chapter 11, Introduction to Deep Learning, produces a value between -1 and 1, which acts analogously to remembering and forgetting. So the RNN transforms the input into a latent dimension, uses the tanh activation to decide what information from the current timestep and the previous memory to keep and what to forget, and uses this new memory to generate an output.
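To make these equations concrete, here is a minimal NumPy sketch of a single RNN step. The dimensions, the random initialization, and the helper name rnn_step are illustrative choices of ours, not code from the book:

import numpy as np

# Illustrative dimensions
input_dim, hidden_dim, output_dim = 4, 8, 1

rng = np.random.default_rng(42)
U = rng.normal(size=(hidden_dim, input_dim))   # input-to-hidden
W = rng.normal(size=(hidden_dim, hidden_dim))  # hidden-to-hidden
V = rng.normal(size=(output_dim, hidden_dim))  # hidden-to-output
b1 = np.zeros(hidden_dim)
b2 = np.zeros(output_dim)

def rnn_step(x_t, h_prev):
    """One RNN timestep: update the hidden state, then emit an output."""
    h_t = np.tanh(U @ x_t + W @ h_prev + b1)  # squashed to (-1, 1)
    o_t = V @ h_t + b2
    return h_t, o_t

# Unroll over a toy sequence of 5 timesteps, starting from a zero state
h = np.zeros(hidden_dim)
for x_t in rng.normal(size=(5, input_dim)):
    h, o = rnn_step(x_t, h)

Note how the same weight matrices are reused at every timestep; only the hidden state h carries information forward.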
In standard backpropagation, we propagate gradients backward from one unit to another. In recurrent nets, we have a special situation: the gradients must be backpropagated within a single unit, but through time, that is, across the different timesteps. A special case of backpropagation, called Backpropagation Through Time (BPTT), has been developed for RNNs.
Thankfully, all the major deep learning frameworks are capable of doing this without any problems. For a more detailed understanding and the mathematical foundations of BPTT, please refer to the Further reading section.
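As a quick illustration of what the frameworks handle for us, the following sketch (with arbitrary sizes and a toy loss of our own choosing) unrolls a vanilla RNN for five timesteps in PyTorch and calls backward(); autograd then propagates gradients back through every timestep, which is BPTT in action:

import torch

torch.manual_seed(0)
input_dim, hidden_dim = 4, 8
U = torch.randn(hidden_dim, input_dim, requires_grad=True)
W = torch.randn(hidden_dim, hidden_dim, requires_grad=True)
b1 = torch.zeros(hidden_dim, requires_grad=True)

x = torch.randn(5, input_dim)      # 5 timesteps of input
h = torch.zeros(hidden_dim)
for t in range(5):                 # forward pass, unrolled through time
    h = torch.tanh(U @ x[t] + W @ h + b1)

loss = h.sum()                     # toy loss on the final hidden state
loss.backward()                    # BPTT: gradients flow back through all 5 steps
print(W.grad.shape)                # torch.Size([8, 8]) -- accumulated across timesteps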
PyTorch has made RNNs available as ready-to-use modules: all you need to do is import one of the modules from the library and start using it. But before we do that, we need to understand a few more concepts.
The first concept we will look at is the possibility of stacking multiple layers of RNNs on top of each other so that the outputs at each timestep become the input to the RNN in the next layer. Each layer will have a hidden state or memory. This enables hierarchical feature learning, which is one of the bedrocks of successful deep learning today.
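A minimal sketch of such a stacked RNN using PyTorch's nn.RNN module (the sizes here are illustrative, not from the book):

import torch
import torch.nn as nn

# Two RNN layers stacked: layer 1's per-timestep outputs feed layer 2
rnn = nn.RNN(input_size=4, hidden_size=8, num_layers=2, batch_first=True)

x = torch.randn(32, 10, 4)   # (batch, seq_len, features)
output, h_n = rnn(x)
print(output.shape)          # torch.Size([32, 10, 8]) -- top layer, every timestep
print(h_n.shape)             # torch.Size([2, 32, 8])  -- final hidden state per layer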
Another concept is the bidirectional RNN, introduced by Schuster and Paliwal in 1997. Bidirectional RNNs are very similar to vanilla RNNs. In a vanilla RNN, we process the inputs sequentially from start to end (forward). A bidirectional RNN uses one set of input-to-hidden and hidden-to-hidden weights to process the inputs from start to end, uses another set to process the inputs in reverse (end to start), and concatenates the hidden states from both directions. It is on this concatenated hidden state that we apply the output equation.
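In PyTorch, this only requires the bidirectional=True flag. In the sketch below (sizes again illustrative), note how the feature dimension of the output doubles because the forward and backward hidden states are concatenated:

import torch
import torch.nn as nn

birnn = nn.RNN(input_size=4, hidden_size=8, batch_first=True, bidirectional=True)

x = torch.randn(32, 10, 4)   # (batch, seq_len, features)
output, h_n = birnn(x)
print(output.shape)          # torch.Size([32, 10, 16]) -- forward and backward concatenated
print(h_n.shape)             # torch.Size([2, 32, 8])   -- one final state per direction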
Reference check:
The research papers by Rumelhart et al. and Schuster and Paliwal are cited in the References section as 5 and 6, respectively.